Instance-based Sentence Boundary Determination by Optimization for Natural Language Generation
Abstract
This paper describes a novel instance-based sentence boundary determination method for natural language generation that optimizes a set of criteria based on examples in a corpus. Compared to existing sentence boundary determination approaches, our work offers three significant contributions. First, our approach provides a general, domain-independent framework that effectively addresses sentence boundary determination by balancing a comprehensive set of sentence complexity and quality-related constraints. Second, our approach can simulate the characteristics and the style of naturally occurring sentences in an application domain, since our solutions are optimized based on their similarities to examples in a corpus. Third, our approach can adapt easily to suit a natural language generation system's capability by balancing the strengths and weaknesses of its subcomponents (e.g., its aggregation and referring expression generation capability). Our final evaluation shows that the proposed method results in significantly better sentence generation outcomes than a widely adopted approach.
Similar Resources
Generating Natural Sentences by Using Shallow Discourse Information
One of the biggest defects of natural language generation systems is that the output sentences are unnatural and contain many redundancies. Machine translation (MT) users, for instance, often get tired of reading the output of MT because of this problem. In this paper, we summarize the results of our analysis of human translation in terms of the use of discourse information to generate target-l...
Sentence boundary detection using sequential dependency analysis combined with CRF-based chunking
In spoken language, sentence boundaries are much less explicit than in written language. Since conventional natural language processing (NLP) techniques are generally designed assuming the sentence boundaries are already given, it is crucial to detect the boundaries accurately for applying such NLP techniques to spoken language. Classification frameworks, such as Support Vector Machines (SVMs) ...
Building applied natural language generation systems
In this article, we give an overview of Natural Language Generation (NLG) from an applied system-building perspective. The article includes a discussion of when NLG techniques should be used; suggestions for carrying out requirements analyses; and a description of the basic NLG tasks of content determination, discourse planning, sentence aggregation, lexicalization, referring expression generat...
Generate Compressed Sentences with Stanford Typed Dependencies towards Abstractive Summarization
In this paper, we implement the sentence generation process for abstractive summarization proposed by Genest and Lapalme (2010). We simply use Stanford Typed Dependencies to extract information items and generate multiple compressed sentences via a Natural Language Generation engine. Then we follow LexRank-based sentence ranking combined with greedy sentence selection to build...
Optimizing question answering systems by Accelerated Particle Swarm Optimization (APSO)
One of the most important research areas in natural language processing is Question Answering Systems (QASs). Existing search engines, with Google at the top, have many remarkable capabilities. But there is a basic limitation (search engines do not have deduction capability), a capability which a QAS is expected to have. In this perspective, a search engine may be viewed as a semi-mechanized QA...
Publication date: 2005